Anytime Learning of Decision Trees

Authors

  • Saher Esmeir
  • Shaul Markovitch
Abstract

The majority of existing algorithms for learning decision trees are greedy: a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Even the few non-greedy learners cannot learn good trees when the concept is difficult. Furthermore, they require a fixed amount of time and are not able to generate a better tree if additional time is available. We introduce a framework for anytime induction of decision trees that overcomes these problems by trading computation speed for better tree quality. Our proposed family of algorithms employs a novel strategy for evaluating candidate splits. A biased sampling of the space of consistent trees rooted at an attribute is used to estimate the size of the minimal tree under that attribute, and an attribute with the smallest expected tree is selected. We present two types of anytime induction algorithms: a contract algorithm that determines the sample size on the basis of a pre-given allocation of time, and an interruptible algorithm that starts with a greedy tree and continuously improves subtrees by additional sampling. Experimental results indicate that, for several hard concepts, our proposed approach exhibits good anytime behavior and yields significantly better decision trees when more time is available.
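The split-evaluation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how the sampling is biased or how tree size is measured, so the code below assumes sampling weighted by information gain, tree size counted in leaves, and the minimum over the sample as the estimate. All function names and the parameter `r` (the sample size, which a contract algorithm would derive from its time budget) are hypothetical.

```python
import math
import random
from collections import Counter

def entropy(rows):
    """Shannon entropy of the class labels in rows, given as (features, label) pairs."""
    n = len(rows)
    counts = Counter(label for _, label in rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def partition(rows, attr):
    """Group rows by their value for attribute index attr."""
    parts = {}
    for x, y in rows:
        parts.setdefault(x[attr], []).append((x, y))
    return parts

def info_gain(rows, attr):
    parts = partition(rows, attr).values()
    remainder = sum(len(p) / len(rows) * entropy(p) for p in parts)
    return entropy(rows) - remainder

def sampled_tree_size(rows, attrs, rng):
    """Leaf count of one randomly grown tree (assumption: bias by information gain)."""
    if len({y for _, y in rows}) <= 1 or not attrs:
        return 1  # pure node, or no attributes left to split on
    weights = [max(info_gain(rows, a), 0.0) for a in attrs]
    if sum(weights) <= 1e-12:
        weights = [1.0] * len(attrs)  # no attribute looks useful: pick uniformly
    attr = rng.choices(attrs, weights=weights, k=1)[0]
    rest = [a for a in attrs if a != attr]
    return sum(sampled_tree_size(p, rest, rng)
               for p in partition(rows, attr).values())

def estimate_min_size(rows, attrs, attr, r, rng):
    """Estimate the smallest tree rooted at attr from r sampled subtrees per branch."""
    rest = [a for a in attrs if a != attr]
    return sum(min(sampled_tree_size(p, rest, rng) for _ in range(r))
               for p in partition(rows, attr).values())

def choose_split(rows, attrs, r=5, seed=0):
    """Pick the attribute whose estimated minimal tree is smallest."""
    rng = random.Random(seed)
    return min(attrs, key=lambda a: estimate_min_size(rows, attrs, a, r, rng))

# Toy data: label = x0 XOR x1, with x2 irrelevant. Every attribute has zero
# information gain at the root, so a purely greedy learner cannot tell them apart.
ROWS = [((a, b, c), a ^ b) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
ATTRS = [0, 1, 2]

if __name__ == "__main__":
    print(choose_split(ROWS, ATTRS))  # one of the XOR attributes, not 2
```

On this toy XOR data, rooting the tree at either relevant attribute yields an estimated size of 4 leaves, while rooting it at the irrelevant attribute yields 8, so the sampled estimate separates candidates that a one-step greedy measure scores identically. A larger `r` spends more time per candidate split in exchange for a tighter estimate.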

Similar resources

Constraint-Based Mining and Learning at ECML/PKDD 2007, CMILE'07, September 21, 2007, Warsaw, Poland

Machine learning techniques are increasingly being used to produce a wide range of classifiers for complex real-world applications that involve different constraints both on the resources allocated for the learning process and on the resources used by the induced model for future classification. As the complexity of these applications grows, the management of these resources becomes a challengi...

Anytime Induction of Decision Trees: An Iterative Improvement Approach

Most existing decision tree inducers are very fast due to their greedy approach. In many real-life applications, however, we are willing to allocate more time to get better decision trees. Our recently introduced LSID3 contract anytime algorithm allows computation speed to be traded for better tree quality. As a contract algorithm, LSID3 must be allocated its resources a priori, which is not al...

When a Decision Tree Learner Has Plenty of Time

The majority of the existing algorithms for learning decision trees are greedy—a tree is induced top-down, making locally optimal decisions at each node. In most cases, however, the constructed tree is not globally optimal. Furthermore, the greedy algorithms require a fixed amount of time and are not able to generate a better tree if additional time is available. To overcome this problem, we pr...

Anytime Induction of Cost-sensitive Trees

Machine learning techniques are increasingly being used to produce a wide range of classifiers for complex real-world applications that involve nonuniform testing costs and misclassification costs. As the complexity of these applications grows, the management of resources during the learning and classification processes becomes a challenging task. In this work we introduce ACT (Anytime Cost-sen...

Two Heuristic Functions for Decision

This paper investigates a different foundation for decision theory in which successive model refinement is central. The idea is to modify utility so that it can sometimes be calculated for an outcome without considering all the relevant properties that can be proved of the outcome, and without considering the utilities of its children. We build partially ordered heuristic utility functions. We ...

Journal:
  • Journal of Machine Learning Research

Volume 8

Publication date: 2007